Deep learning (DL) based downscaling has become a popular tool in the earth sciences. Increasingly, DL approaches are being adopted to downscale coarse precipitation data and produce more accurate and reliable estimates at local (~few km or even smaller) scales. Although several studies have adopted dynamical or statistical downscaling of precipitation, their accuracy is limited by the availability of ground truth. A key challenge in measuring the accuracy of such methods is comparing the downscaled data with point-scale observations, which are often unavailable at such small scales. In this work, we carry out DL-based downscaling to estimate local precipitation from India Meteorological Department (IMD) data, which were created by approximating values from station locations to grid points. To test the efficacy of different DL approaches, we apply four downscaling methods and evaluate their performance. The methods considered are (i) Deep Statistical Downscaling (DeepSD), (ii) an augmented Convolutional Long Short-Term Memory (ConvLSTM), (iii) a fully convolutional network (U-Net), and (iv) a Super-Resolution Generative Adversarial Network (SR-GAN). The custom VGG network used in the SR-GAN is developed in this work using precipitation data. The results indicate that SR-GAN is the best method for downscaling precipitation data. The downscaled data are validated against precipitation values at IMD stations. This DL-based approach offers a promising alternative to statistical downscaling.
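As a rough illustration of the super-resolution style of downscaling described above, the sketch below shows a toy convolutional generator that upsamples a coarse precipitation grid; the architecture, layer sizes, and upscale factor are illustrative assumptions, not the networks used in the paper.

```python
# Minimal sketch (not the authors' code): a toy super-resolution generator of the kind
# used for precipitation downscaling, upsampling a coarse rain grid by a factor of 4.
import torch
import torch.nn as nn

class ToySRGenerator(nn.Module):
    def __init__(self, upscale: int = 4, channels: int = 64):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(1, channels, 9, padding=4), nn.ReLU())
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        # PixelShuffle performs the spatial upsampling from the coarse to the fine grid.
        self.tail = nn.Sequential(
            nn.Conv2d(channels, channels * upscale ** 2, 3, padding=1),
            nn.PixelShuffle(upscale),
            nn.Conv2d(channels, 1, 9, padding=4),
            nn.ReLU(),  # precipitation is non-negative
        )

    def forward(self, coarse_precip: torch.Tensor) -> torch.Tensor:
        x = self.head(coarse_precip)
        x = x + self.body(x)          # simple residual connection
        return self.tail(x)

if __name__ == "__main__":
    coarse = torch.rand(8, 1, 16, 16)        # batch of 16x16 coarse rain grids
    fine = ToySRGenerator()(coarse)           # -> (8, 1, 64, 64) downscaled estimates
    print(fine.shape)
```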
We present NusaCrowd, a collaborative initiative to collect and unite existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have brought together 137 datasets and 117 standardized data loaders. The quality of the datasets has been assessed manually and automatically, and their effectiveness has been demonstrated in multiple experiments. NusaCrowd's data collection enables the creation of the first zero-shot benchmarks for natural language understanding and generation in Indonesian and its local languages. Furthermore, NusaCrowd enables the creation of the first multilingual automatic speech recognition benchmark in Indonesian and its local languages. Our work is intended to help advance natural language processing research in under-represented languages.
Cloud computing holds the promise of reduced costs through economies of scale. To realize this promise, cloud computing vendors typically solve sequential resource allocation problems, where customer workloads are packed on shared hardware. Virtual machines (VMs) form the foundation of modern cloud computing as they help logically abstract user compute from shared physical infrastructure. Traditionally, VM packing problems are solved by predicting demand, followed by a Model Predictive Control (MPC) optimization over a future horizon. We introduce an approximate formulation of an industrial VM packing problem as an MILP with soft constraints parameterized by the predictions. Recently, predict-and-optimize (PnO) was proposed for end-to-end training of prediction models by back-propagating the cost of decisions through the optimization problem. However, PnO is unable to scale to the large prediction horizons prevalent in cloud computing. To tackle this issue, we propose the Predict-and-Critic (PnC) framework that outperforms PnO with just a two-step horizon by leveraging reinforcement learning. PnC jointly trains a prediction model and a terminal Q function that approximates cost-to-go over a long horizon, by back-propagating the cost of decisions through the optimization problem \emph{and from the future}. The terminal Q function allows us to solve a much smaller two-step horizon optimization problem than the multi-step horizon necessary in PnO. We evaluate PnO and the PnC framework on two datasets, three workloads, and with disturbances not modeled in the optimization problem. We find that PnC significantly improves decision quality over PnO, even when the optimization problem is not a perfect representation of reality. We also find that hardening the soft constraints of the MILP and back-propagating through the constraints improves decision quality for both PnO and PnC.
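The following is a minimal, hedged sketch of the Predict-and-Critic idea in PyTorch: a predictor forecasts demand over a two-step horizon, an unrolled differentiable relaxation stands in for the soft-constrained packing MILP, and a terminal Q network approximates the cost-to-go. All dimensions, penalties, and the relaxed surrogate are assumptions made for illustration, not the paper's implementation.

```python
# Hedged sketch of Predict-and-Critic (not the paper's code).
import torch
import torch.nn as nn

H, N_FEAT, CAPACITY, PENALTY = 2, 8, 1.0, 10.0

predictor = nn.Sequential(nn.Linear(N_FEAT, 32), nn.ReLU(), nn.Linear(32, H))   # demand forecast
terminal_q = nn.Sequential(nn.Linear(H, 32), nn.ReLU(), nn.Linear(32, 1))       # cost-to-go critic
opt = torch.optim.Adam(list(predictor.parameters()) + list(terminal_q.parameters()), lr=1e-3)

def solve_relaxed(pred_demand, steps=20, lr=0.1):
    """Differentiable stand-in for the 2-step soft-constrained packing MILP: unrolled
    gradient descent on stage cost + soft-constraint penalties + terminal Q."""
    alloc = torch.zeros_like(pred_demand, requires_grad=True)
    for _ in range(steps):
        shortfall = torch.relu(pred_demand - alloc)                       # unmet predicted demand
        overflow = torch.relu(alloc.sum(-1, keepdim=True) - CAPACITY)     # over-packed capacity
        cost = (alloc.sum(-1) + PENALTY * (shortfall.sum(-1) + overflow.sum(-1))
                + terminal_q(alloc).squeeze(-1)).sum()
        (g,) = torch.autograd.grad(cost, alloc, create_graph=True)
        alloc = alloc - lr * g                                            # keep the graph for backprop
    return alloc

features = torch.randn(64, N_FEAT)     # observed workload features (toy data)
true_demand = torch.rand(64, H)        # realized demand over the 2-step horizon
future_cost = torch.rand(64)           # realized cost-to-go beyond the horizon (toy data)

for _ in range(200):
    alloc = solve_relaxed(predictor(features))
    realized = alloc.sum(-1) + PENALTY * torch.relu(true_demand - alloc).sum(-1)
    critic_loss = nn.functional.mse_loss(terminal_q(alloc.detach()).squeeze(-1), future_cost)
    loss = realized.mean() + critic_loss          # decision cost backprops through the solver
    opt.zero_grad(); loss.backward(); opt.step()
```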
Deep neural networks have emerged as the workhorse for a large section of robotics and control applications, especially as models for dynamical systems. Such data-driven models are in turn used for designing and verifying autonomous systems. This is particularly useful in modeling medical systems where data can be leveraged to individualize treatment. In safety-critical applications, it is important that the data-driven model is conformant to established knowledge from the natural sciences. Such knowledge is often available or can often be distilled into a (possibly black-box) model $M$; for instance, the unicycle model for an F1 racing car. In this light, we consider the following problem: given a model $M$ and a state transition dataset, we wish to best approximate the system model while remaining a bounded distance away from $M$. We propose a method to guarantee this conformance. Our first step is to distill the dataset into a few representative samples called memories, using the idea of a growing neural gas. Next, using these memories we partition the state space into disjoint subsets and compute bounds that should be respected by the neural network when the input is drawn from a particular subset. This serves as a symbolic wrapper for guaranteed conformance. We argue theoretically that this leads to only a bounded increase in approximation error, which can be controlled by increasing the number of memories. We experimentally show that on three case studies (Car Model, Drones, and Artificial Pancreas), our constrained neurosymbolic models conform to specified $M$ models (each encoding various constraints) with order-of-magnitude improvements compared to the augmented Lagrangian and vanilla training methods.
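A minimal sketch of the symbolic-wrapper idea follows, assuming a toy prior model M, k-means in place of a growing neural gas, and a fixed per-cell tolerance; it illustrates the mechanism rather than the paper's implementation.

```python
# Hedged sketch (not the paper's code): memories partition the input space by nearest
# neighbor, and the network output is clamped to bounds derived from the prior model M
# at each cell, so conformance holds by construction.
import numpy as np

def prior_model_M(x):
    """Stand-in physics model M (e.g., a simple kinematic step); illustrative only."""
    return 0.9 * x

def build_memories(data, n_memories=16, iters=50, seed=0):
    """Crude k-means as a stand-in for a growing-neural-gas distillation of the dataset."""
    rng = np.random.default_rng(seed)
    mem = data[rng.choice(len(data), n_memories, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((data[:, None] - mem[None]) ** 2).sum(-1), axis=1)
        for j in range(n_memories):
            if np.any(assign == j):
                mem[j] = data[assign == j].mean(0)
    return mem

def conformant_predict(net, x, memories, tol=0.5):
    """Clamp the output to per-cell bounds derived from M at the cell's memory point."""
    j = np.argmin(((memories - x) ** 2).sum(-1))        # which subset/cell x falls in
    center = prior_model_M(memories[j])
    return np.clip(net(x), center - tol, center + tol), j

# usage: a toy unconstrained "network" plus random state-transition data
data = np.random.randn(500, 2)
memories = build_memories(data)
net = lambda x: 0.9 * x + 0.3 * np.sin(x)
y, cell = conformant_predict(net, np.array([0.5, -1.0]), memories)
```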
This paper studies audio-visual noise suppression for egocentric videos -- where the speaker is not captured in the video. Instead, potential noise sources are visible on screen, with the camera emulating the off-screen speaker's view of the outside world. This setting is different from prior work in audio-visual speech enhancement that relies on lip and facial visuals. In this paper, we first demonstrate that egocentric visual information is helpful for noise suppression. We compare object recognition and action classification based visual feature extractors, and investigate methods to align audio and visual representations. Then, we examine different fusion strategies for the aligned features, and locations within the noise suppression model to incorporate visual information. Experiments demonstrate that visual features are most helpful when used to generate additive correction masks. Finally, in order to ensure that the visual features are discriminative with respect to different noise types, we introduce a multi-task learning framework that jointly optimizes audio-visual noise suppression and video-based acoustic event detection. The proposed multi-task framework outperforms the audio-only baseline on all metrics, including a 0.16 PESQ improvement. Extensive ablations reveal the improved performance of the proposed model with multiple active distractors, over all noise types and across different SNRs.
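A hedged sketch of the additive-correction-mask style of fusion is shown below; the feature dimensions, the GRU audio encoder, and the sigmoid masking are illustrative assumptions rather than the paper's architecture.

```python
# Hedged sketch (not the paper's model): fuse aligned visual features with an audio
# enhancement network by predicting an additive correction to the audio-only mask logits.
import torch
import torch.nn as nn

class AVMaskNet(nn.Module):
    def __init__(self, n_freq=257, vis_dim=512, hidden=256):
        super().__init__()
        self.audio_rnn = nn.GRU(n_freq, hidden, batch_first=True)
        self.audio_mask = nn.Linear(hidden, n_freq)                        # audio-only mask logits
        self.vis_proj = nn.Sequential(nn.Linear(vis_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, n_freq))           # additive correction

    def forward(self, noisy_mag, vis_feat):
        # noisy_mag: (B, T, n_freq) spectrogram magnitudes; vis_feat: (B, T, vis_dim)
        h, _ = self.audio_rnn(noisy_mag)
        mask = torch.sigmoid(self.audio_mask(h) + self.vis_proj(vis_feat))
        return mask * noisy_mag                                            # enhanced magnitudes

mags = torch.rand(4, 100, 257)
vis = torch.randn(4, 100, 512)     # e.g., per-frame object/action recognition features
enhanced = AVMaskNet()(mags, vis)
```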
Hate speech classification has been a long-standing problem in natural language processing. However, even though there are numerous hate speech detection methods, they often overlook much of the hate speech that is implicit in nature. Developing datasets to aid the task of implicit hate speech classification comes with its own challenges: the difficulties lie in linguistic nuance, varying definitions of what constitutes hate speech, and the labor-intensive annotation process. This has led to a scarcity of data available to train and test such systems, which gives rise to high-variance problems when parameter-heavy transformer-based models are used to address the task. In this paper, we explore various optimization and regularization techniques and develop a novel RoBERTa-based model that achieves state-of-the-art performance.
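A minimal sketch of such a fine-tuning setup is shown below, assuming the Hugging Face transformers API and a toy dataset; the specific regularization values are illustrative, not the paper's recipe.

```python
# Hedged sketch: fine-tuning a RoBERTa classifier with common regularization choices
# (extra dropout, weight decay, gradient clipping, linear warmup). Hyperparameters and
# the tiny toy dataset are illustrative assumptions.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          get_linear_schedule_with_warmup)

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2, hidden_dropout_prob=0.2)   # extra dropout as regularization

texts = ["an implicitly hateful example", "a benign example"]  # placeholder data
labels = torch.tensor([1, 0])
batch = tok(texts, padding=True, truncation=True, return_tensors="pt")

optim = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
sched = get_linear_schedule_with_warmup(optim, num_warmup_steps=10, num_training_steps=100)

model.train()
for _ in range(3):                                   # a few toy steps
    out = model(**batch, labels=labels)              # cross-entropy loss computed internally
    out.loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optim.step(); sched.step(); optim.zero_grad()
```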
Inferring linear relationships lies at the heart of many empirical investigations. A measure of linear dependence should correctly assess the strength of the relationship and also qualify as meaningful for the population. Pearson's correlation coefficient (PCC), the \textit{de-facto} measure for bivariate relationships, falls short on both counts. The estimated strength $r$ may be erroneous due to limited sample size and non-normality of the data. In the context of statistical significance testing, misinterpreting the $p$-value as a posterior probability leads to Type I errors, a general issue with significance testing that extends to PCC. Such errors are exacerbated when multiple hypotheses are tested simultaneously. To address these issues, we propose a machine learning-based predictive data calibration method that essentially conditions the data samples on the expected linear relationship. Computing PCC on the calibrated data yields a calibrated $p$-value that can be interpreted as a posterior probability, along with a calibrated $r$ estimate, a desired outcome not provided by other methods. Furthermore, the ensuing independent interpretation of each test may eliminate the need for multiple-testing correction. We provide empirical evidence in favor of the proposed method using several simulations and an application to real-world data.
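To make the workflow concrete, the sketch below computes PCC and its p-value on raw and on "calibrated" data; the calibrate() step is a hypothetical placeholder, since the abstract does not specify the calibration model.

```python
# Hedged illustration of the workflow the abstract describes. The calibrate() function is
# a hypothetical stand-in (regression fit plus resampled residuals), not the paper's method.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.3 * x + rng.standard_t(df=3, size=200)          # non-normal noise

def calibrate(x, y):
    """Hypothetical placeholder: replace y with its prediction under the expected linear
    relation plus resampled residuals, conditioning the sample on that relation."""
    model = LinearRegression().fit(x[:, None], y)
    resid = y - model.predict(x[:, None])
    return x, model.predict(x[:, None]) + rng.permutation(resid)

r_raw, p_raw = pearsonr(x, y)                 # de-facto bivariate measure on raw data
r_cal, p_cal = pearsonr(*calibrate(x, y))     # "calibrated" estimate and p-value
print(f"raw r={r_raw:.3f} p={p_raw:.3g} | calibrated r={r_cal:.3f} p={p_cal:.3g}")
```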
Machine learning models are prone to making incorrect predictions on inputs that lie far from the training distribution. This hinders their deployment in safety-critical applications such as autonomous vehicles and healthcare. Detecting a shift from the training distribution for individual data points has received considerable attention, and many techniques have been proposed for such out-of-distribution (OOD) detection. But in many applications, the inputs to a machine learning model form a temporal sequence, and existing OOD detection techniques for time-series data either do not exploit the temporal relationships in the sequence or do not provide any detection guarantees. We propose using the deviation from in-distribution temporal equivariance as the non-conformity measure in a conformal anomaly detection framework for OOD detection in time-series data. This leads to the proposed detector, CODiT, with guarantees on false detection in time-series data. We illustrate the efficacy of CODiT by achieving state-of-the-art results on computer vision datasets in autonomous driving. We also show that CODiT can be used for OOD detection on non-vision datasets through experiments on a physiological gait sensory dataset. Code, data, and trained models are available at https://github.com/kaustubhsridhar/time-series-ood.
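The sketch below illustrates the generic conformal anomaly detection recipe that CODiT builds on, with a hand-crafted stand-in for the temporal-equivariance deviation score; it is not the CODiT implementation.

```python
# Hedged sketch of conformal OOD detection for time-series windows: a non-conformity score
# is computed for a calibration set of in-distribution windows, and a test window's
# conformal p-value is its rank among them. The score here (deviation from a naive
# first-order extrapolation) is an illustrative stand-in for the learned measure.
import numpy as np

def nonconformity(window: np.ndarray) -> float:
    # Assumption: in-distribution windows are roughly smooth, so a naive one-step
    # extrapolation should agree closely with the observed next samples.
    pred = 2 * window[1:-1] - window[:-2]
    return float(np.mean((window[2:] - pred) ** 2))

def conformal_p_value(test_window, calibration_windows):
    cal_scores = np.array([nonconformity(w) for w in calibration_windows])
    s = nonconformity(test_window)
    # fraction of calibration scores at least as extreme (plus one, for validity)
    return (1 + np.sum(cal_scores >= s)) / (len(cal_scores) + 1)

rng = np.random.default_rng(0)
calib = [np.cumsum(rng.normal(size=50)) * 0.05 for _ in range(100)]   # in-distribution walks
iid_p = conformal_p_value(np.cumsum(rng.normal(size=50)) * 0.05, calib)
ood_p = conformal_p_value(rng.normal(size=50) * 2.0, calib)           # jumpy OOD signal
print(iid_p, ood_p)   # a small p-value flags the window as OOD at a chosen rate alpha
```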
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables comparison on an equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics benchmark provides a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online, and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
Adversarial training (AT) and its variants have made great strides over the past few years in improving the robustness of neural networks against adversarial perturbations and common corruptions. The algorithmic design of AT and its variants focuses on training models at a specified perturbation strength $\epsilon$ and only uses feedback from the performance of that $\epsilon$-robust model to improve the algorithm. In this work, we focus instead on models trained over a spectrum of $\epsilon$ values. We analyze three perspectives: model performance, intermediate feature precision, and convolution filter sensitivity. In each case, we identify alternative improvements to AT that are otherwise not apparent at a single $\epsilon$. Specifically, we find that for a PGD attack at some strength $\delta$, there is an AT model trained at some slightly larger strength $\epsilon$, but no larger, that generalizes best to it. Hence, we propose overdesigning for robustness, where we suggest training models at an $\epsilon$ just above $\delta$. Second, we observe (across various $\epsilon$ values) that robustness is highly sensitive to the precision of intermediate features, especially those after the first and second layers. Thus, we suggest adding simple quantization to defenses to improve accuracy against seen and unseen adaptive attacks. Third, we analyze the convolution filters of each layer of models trained with increasing $\epsilon$ and note that those of the first and second layers may be solely responsible for amplifying input perturbations. We present our findings and demonstrate our techniques through experiments with ResNet and WideResNet models on the CIFAR-10 and CIFAR-10-C datasets.
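For reference, a minimal $L_\infty$ PGD attack of the kind referred to above is sketched below, together with the "overdesign" choice of training at an $\epsilon$ slightly above the attack strength $\delta$; the step sizes and the margin are assumptions, not the paper's settings.

```python
# Hedged sketch (not the paper's code): an L_inf PGD attack at strength delta.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, delta=8/255, step=2/255, iters=10):
    """Project onto the L_inf ball of radius `delta` around x after each gradient step."""
    x_adv = (x + torch.empty_like(x).uniform_(-delta, delta)).clamp(0, 1).detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + step * grad.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - delta), x + delta).clamp(0, 1)
    return x_adv

# Overdesigning for robustness: attack at delta, but adversarially train at a slightly
# larger epsilon; the margin of 2/255 is an illustrative assumption.
attack_delta = 8 / 255
train_epsilon = attack_delta + 2 / 255

if __name__ == "__main__":
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
    x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
    x_adv = pgd_attack(model, x, y, delta=attack_delta)
```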